AITopics | planning mechanism

Credit assignment, i.e. determining how to correctly associate delayed rewards with states or state-action pairs, is a crucial problem for reinforcement learning (RL) agents ( Sutton and Barto, 2018).

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Plan, Attend, Generate: Planning for Sequence-to-Sequence Models

Caglar Gulcehre, Francis Dutil, Adam Trischler, Yoshua Bengio

Neural Information Processing SystemsOct-4-2024, 02:47:35 GMT

We investigate the integration of a planning mechanism into sequence-to-sequence models using attention. We develop a model which can plan ahead in the future when it computes its alignments between input and output sequences, constructing a matrix of proposed future alignments and a commitment vector that governs whether to follow or recompute the plan. This mechanism is inspired by the recently proposed strategic attentive reader and writer (STRAW) model for Reinforcement Learning. Our proposed model is end-to-end trainable using primarily differentiable operations. We show that it outperforms a strong baseline on character-level translation tasks from WMT'15, the algorithmic task of finding Eulerian circuits of graphs, and question generation from the text. Our analysis demonstrates that the model computes qualitatively intuitive alignments, converges faster than the baselines, and achieves superior performance with fewer parameters.

alignment, arxiv preprint arxiv, sequence, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)

Add feedback

Plan, Attend, Generate: Planning for Sequence-to-Sequence Models

Gulcehre, Caglar, Dutil, Francis, Trischler, Adam, Bengio, Yoshua

Neural Information Processing SystemsDec-31-2017

We investigate the integration of a planning mechanism into sequence-to-sequence models using attention. We develop a model which can plan ahead in the future when it computes its alignments between input and output sequences, constructing a matrix of proposed future alignments and a commitment vector that governs whether to follow or recompute the plan. This mechanism is inspired by the recently proposed strategic attentive reader and writer (STRAW) model for Reinforcement Learning. Our proposed model is end-to-end trainable using primarily differentiable operations. We show that it outperforms a strong baseline on character-level translation tasks from WMT'15, the algorithmic task of finding Eulerian circuits of graphs, and question generation from the text. Our analysis demonstrates that the model computes qualitatively intuitive alignments, converges faster than the baselines, and achieves superior performance with fewer parameters.

machine learning, natural language, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)

Add feedback

Plan, Attend, Generate: Planning for Sequence-to-Sequence Models

Dutil, Francis, Gulcehre, Caglar, Trischler, Adam, Bengio, Yoshua

arXiv.org Machine LearningNov-28-2017

We investigate the integration of a planning mechanism into sequence-to-sequence models using attention. We develop a model which can plan ahead in the future when it computes its alignments between input and output sequences, constructing a matrix of proposed future alignments and a commitment vector that governs whether to follow or recompute the plan. This mechanism is inspired by the recently proposed strategic attentive reader and writer (STRAW) model for Reinforcement Learning. Our proposed model is end-to-end trainable using primarily differentiable operations. We show that it outperforms a strong baseline on character-level translation tasks from WMT'15, the algorithmic task of finding Eulerian circuits of graphs, and question generation from the text. Our analysis demonstrates that the model computes qualitatively intuitive alignments, converges faster than the baselines, and achieves superior performance with fewer parameters.

machine learning, natural language, reinforcement learning, (18 more...)

arXiv.org Machine Learning

1711.10462

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Technology: